    Uncertainty modeling and interpretability in convolutional neural networks for polyp segmentation

    Convolutional Neural Networks (CNNs) are driving advances in a range of computer vision tasks such as object detection and object segmentation. Their success has motivated research into applications of such models for medical image analysis. If CNN-based models are to be helpful in a medical context, they need to be precise and interpretable, and the uncertainty in their predictions must be well understood. In this paper, we develop and evaluate recent advances in uncertainty estimation and model interpretability in the context of semantic segmentation of polyps from colonoscopy images. We evaluate and enhance several architectures of Fully Convolutional Networks (FCNs) for semantic segmentation of colorectal polyps and provide a comparison between these models. Our highest-performing model achieves a mean IoU of 76.06% on the EndoScene dataset, a considerable improvement over the previous state of the art.
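
    The abstract does not spell out which uncertainty estimator is used, but a common choice in this line of work is Monte Carlo dropout: dropout is kept active at test time and several stochastic forward passes yield a mean prediction and a per-pixel uncertainty map. The sketch below is a minimal, hypothetical illustration of that idea; the network, shapes, and sample count are assumptions, not the paper's configuration.

```python
# Minimal Monte Carlo dropout sketch for segmentation uncertainty.
# Illustrative only; not the paper's exact model or estimator.
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """Small fully convolutional network with dropout, for illustration."""
    def __init__(self, in_ch=3, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(p=0.5),                 # kept active during MC sampling
            nn.Conv2d(16, n_classes, 1),
        )

    def forward(self, x):
        return self.features(x)

@torch.no_grad()
def mc_dropout_predict(model, image, n_samples=20):
    """Return mean softmax prediction and per-pixel predictive std."""
    model.train()  # keep dropout layers stochastic at inference time
    probs = torch.stack([
        torch.softmax(model(image), dim=1) for _ in range(n_samples)
    ])                                           # (n_samples, B, C, H, W)
    return probs.mean(0), probs.std(0)           # prediction, uncertainty map

if __name__ == "__main__":
    model = TinyFCN()
    frame = torch.rand(1, 3, 64, 64)             # dummy colonoscopy frame
    mean_pred, uncertainty = mc_dropout_predict(model, frame)
    print(mean_pred.shape, uncertainty.shape)
```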

    Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series

    Deep learning-based support systems have demonstrated encouraging results in numerous clinical applications involving the processing of time series data. While such systems are often very accurate, they have no inherent mechanism for explaining what influenced their predictions, which is critical for clinical tasks. Moreover, existing explainability techniques lack an important component for trustworthy and reliable decision support, namely a notion of uncertainty. In this paper, we address this lack of uncertainty by proposing a deep ensemble approach in which a collection of deep neural networks (DNNs) is trained independently. The class activation mapping method is used to assign a relevance score to each time step in the time series, and a measure of uncertainty in these relevance scores is computed by taking the standard deviation across the relevance scores produced by each model in the ensemble, which in turn is used to make the explanations more reliable. Results demonstrate that the proposed ensemble is more accurate in locating relevant time steps and is more consistent across random initializations, thus making the model more trustworthy. The proposed methodology paves the way for constructing trustworthy and dependable support systems for processing clinical time series in healthcare-related tasks.
    Comment: 11 pages, 9 figures; code at https://github.com/Wickstrom/TimeSeriesXA
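
    The mechanism described above can be pictured with a small sketch: several independently trained 1D CNNs each produce a class activation map (CAM) over time steps, and the standard deviation of these relevance scores across the ensemble serves as the explanation uncertainty. The architecture, shapes, and five-member ensemble below are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Illustrative ensemble-CAM sketch: per-time-step relevance plus its
# uncertainty, taken as the std across independently initialised models.
import torch
import torch.nn as nn

class Cnn1d(nn.Module):
    """1D CNN with global average pooling, so CAM is directly applicable."""
    def __init__(self, in_ch=1, n_filters=32, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_ch, n_filters, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.fc = nn.Linear(n_filters, n_classes)

    def forward(self, x):
        feats = self.conv(x)                      # (B, F, T)
        logits = self.fc(feats.mean(dim=-1))      # global average pooling
        return logits, feats

def cam_relevance(model, x, target_class):
    """CAM relevance per time step: class weights times feature maps."""
    _, feats = model(x)                           # (B, F, T)
    w = model.fc.weight[target_class]             # (F,)
    return torch.einsum("f,bft->bt", w, feats)    # (B, T)

@torch.no_grad()
def ensemble_relevance(models, x, target_class):
    rel = torch.stack([cam_relevance(m, x, target_class) for m in models])
    return rel.mean(0), rel.std(0)                # relevance and its uncertainty

if __name__ == "__main__":
    ensemble = [Cnn1d() for _ in range(5)]        # independently initialised models
    series = torch.randn(1, 1, 100)               # dummy clinical time series
    relevance, uncertainty = ensemble_relevance(ensemble, series, target_class=1)
    print(relevance.shape, uncertainty.shape)
```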

    The Meta-Evaluation Problem in Explainable AI: Identifying Reliable Estimators with MetaQuantus

    Explainable AI (XAI) is a rapidly evolving field that aims to improve the transparency and trustworthiness of AI systems for humans. One of the unsolved challenges in XAI is estimating the performance of these explanation methods for neural networks, which has resulted in numerous competing metrics with little to no indication of which one is to be preferred. In this paper, to identify the most reliable evaluation method in a given explainability context, we propose MetaQuantus, a simple yet powerful framework that meta-evaluates two complementary performance characteristics of an evaluation method: its resilience to noise and its reactivity to randomness. We demonstrate the effectiveness of our framework through a series of experiments targeting various open questions in XAI, such as the selection of explanation methods and the optimisation of hyperparameters of a given metric. We release our work under an open-source license to serve as a development tool for XAI researchers and Machine Learning (ML) practitioners to verify and benchmark newly constructed metrics (i.e., "estimators" of explanation quality). With this work, we provide clear and theoretically grounded guidance for building reliable evaluation methods, thus facilitating standardisation and reproducibility in the field of XAI.
    Comment: 30 pages, 12 figures, 3 tables
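
    The two characteristics named above can be pictured with a purely schematic sketch: a quality estimator should give stable scores under minor input noise (resilience) and should change when the model is disrupted, e.g. randomised (reactivity). The code below is a hypothetical illustration of that idea only; it does not use or reproduce the actual MetaQuantus API, and all names are placeholders.

```python
# Schematic meta-evaluation of an explanation-quality estimator `metric`.
# Hypothetical placeholder code, not the MetaQuantus API.
import numpy as np

def meta_evaluate(metric, model, randomized_model, x, explanation, n_trials=10, eps=1e-3):
    """Return (resilience, reactivity) for a given quality estimator.

    metric(model, x, explanation) -> float is assumed to score an explanation.
    """
    rng = np.random.default_rng(0)
    base = metric(model, x, explanation)

    # Resilience: scores under minor input noise should stay close to the base score.
    minor = [metric(model, x + eps * rng.standard_normal(x.shape), explanation)
             for _ in range(n_trials)]
    resilience = 1.0 / (1.0 + np.std(minor))

    # Reactivity: the score should change when the model is randomised.
    disrupted = metric(randomized_model, x, explanation)
    reactivity = abs(base - disrupted)

    return resilience, reactivity

if __name__ == "__main__":
    # Toy metric: correlation between the explanation and a stand-in "model gradient".
    toy_metric = lambda m, x, e: float(np.corrcoef(e, m(x))[0, 1])
    model = lambda x: 2.0 * x
    randomized = lambda x: np.random.default_rng(1).standard_normal(x.shape)
    x = np.linspace(0.0, 1.0, 50)
    explanation = 2.0 * x + 0.01
    print(meta_evaluate(toy_metric, model, randomized, x, explanation))
```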

    A clinically motivated self-supervised approach for content-based image retrieval of CT liver images

    Deep learning-based approaches for content-based image retrieval (CBIR) of computed tomography (CT) liver images are an active field of research, but they suffer from some critical limitations. First, they are heavily reliant on labeled data, which can be challenging and costly to acquire. Second, they lack transparency and explainability, which limits the trustworthiness of deep CBIR systems. We address these limitations by (1) proposing a self-supervised learning framework that incorporates domain knowledge into the training procedure, and (2) providing the first representation learning explainability analysis in the context of CBIR of CT liver images. Results demonstrate improved performance compared to the standard self-supervised approach across several metrics, as well as improved generalization across datasets. Our explainability analysis reveals new insights into the feature extraction process. Lastly, we perform a case study with cross-examination CBIR that demonstrates the usability of our proposed framework. We believe that our proposed framework could play a vital role in creating trustworthy deep CBIR systems that can successfully take advantage of unlabeled data.

    Advancing Deep Learning with Emphasis on Data-Driven Healthcare

    The right to health is a fundamental human right, but many challenges face those who seek to uphold it. A shortage of trained healthcare personnel, rising costs, and an ageing population are just a few examples of current obstacles in the healthcare sector. Tackling such problems is essential for providing reliable, high-quality care to people around the world. Many researchers and healthcare professionals believe that data-driven healthcare has the potential to solve many of these problems. Data-driven methods are based on algorithms that learn to perform tasks by identifying patterns in data, and they often improve as more data is collected. A central driving force in modern data-driven healthcare is deep learning. Deep learning is part of the field of representation learning, where the goal is to learn a representation of the data that is beneficial for performing a given task. Deep learning has led to major improvements in important healthcare domains such as image and language processing. However, deep learning algorithms lack interpretability, do not express uncertainty, and struggle when tasked with learning from data without human annotations. These are fundamental limitations that must be addressed if data-driven healthcare based on deep learning is to reach its full potential. To tackle these limitations, we propose new deep learning methodology. We present the first methods for capturing uncertainty in explanations of predictions, and we introduce the first framework for explaining representations of data. We also introduce a new method that exploits domain knowledge to extract clinically relevant attributes from medical images. Our focus is on healthcare applications, but the proposed methodology can be applied in other domains as well. We believe that the innovations in this thesis can play an important role in creating reliable deep learning algorithms that can learn from unlabeled data.

    Auroral Image Classification With Deep Neural Networks

    Results from a study of automatic aurora classification using machine learning techniques are presented. The aurora is the manifestation of physical phenomena in the ionosphere‐magnetosphere environment. Automatic classification of millions of auroral images from the Arctic and Antarctic is therefore an attractive tool for developing auroral statistics and for supporting scientists to study auroral images in an objective, organized, and repeatable manner. Although previous studies have presented tools for detecting aurora, there has been a lack of tools for classifying aurora into subclasses with a high precision (>90%). This work considers seven auroral subclasses: breakup, colored, arcs, discrete, patchy, edge, and faint. Six different deep neural network architectures have been tested along with the well‐known classification algorithms: k‐nearest neighbor (KNN) and a support vector machine (SVM). A set of clean nighttime color auroral images, without clearly ambiguous auroral forms, moonlight, twilight, clouds, and so forth, was used for training and testing the classifiers. The deep neural networks generally outperformed the KNN and SVM methods, and the ResNet‐50 architecture achieved the highest performance with an average classification precision of 92%.
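
    A minimal sketch of the best-performing setup described above, a ResNet-50 backbone with its classification head replaced for the seven auroral subclasses, is shown below; the pretrained weights, input size, and training details are assumptions for illustration, not the study's exact configuration.

```python
# Illustrative transfer-learning sketch: ResNet-50 with a 7-class auroral head.
import torch
import torch.nn as nn
from torchvision import models

AURORA_CLASSES = ["breakup", "colored", "arcs", "discrete", "patchy", "edge", "faint"]

def build_aurora_classifier(pretrained=True):
    """ResNet-50 with its final layer replaced by a 7-class head."""
    weights = models.ResNet50_Weights.DEFAULT if pretrained else None
    model = models.resnet50(weights=weights)
    model.fc = nn.Linear(model.fc.in_features, len(AURORA_CLASSES))
    return model

if __name__ == "__main__":
    model = build_aurora_classifier(pretrained=False)  # skip weight download for the demo
    batch = torch.rand(2, 3, 224, 224)                 # dummy nighttime colour images
    logits = model(batch)
    print(logits.shape)                                # torch.Size([2, 7])
```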

    Mixing up contrastive learning: Self-supervised representation learning for time series

    The lack of labeled data is a key challenge for learning useful representations from time series data. An unsupervised representation learning framework capable of producing high-quality representations could therefore be of great value: it is key to enabling transfer learning, which is especially beneficial for medical applications, where data are abundant but labeling is costly and time-consuming. We propose an unsupervised contrastive learning framework that is motivated from the perspective of label smoothing. The proposed approach uses a novel contrastive loss that naturally exploits a data augmentation scheme in which new samples are generated by mixing two data samples with a mixing component. The task in the proposed framework is to predict the mixing component, which is utilized as soft targets in the loss function. Experiments demonstrate the framework's superior performance compared to other representation learning approaches on both univariate and multivariate time series and illustrate its benefits for transfer learning for clinical time series.
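
    A simplified sketch of the mixing idea described above: a new sample is formed by mixing two series with a coefficient lam, and the loss asks the encoder to recover lam from the mixed sample's similarities to the two originals, used as soft targets. The encoder, temperature, and Beta prior below are illustrative assumptions, not the authors' exact configuration.

```python
# Simplified mixup-style contrastive loss with soft targets (lam, 1 - lam).
# Illustrative sketch only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny 1D convolutional encoder producing a normalized representation."""
    def __init__(self, in_ch=1, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, 7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def mixup_contrastive_loss(encoder, x1, x2, lam, tau=0.5):
    """The mixed sample should be lam-similar to x1 and (1 - lam)-similar to x2."""
    x_mix = lam * x1 + (1.0 - lam) * x2
    z_mix, z1, z2 = encoder(x_mix), encoder(x1), encoder(x2)
    sims = torch.stack([(z_mix * z1).sum(-1), (z_mix * z2).sum(-1)], dim=-1) / tau
    targets = torch.tensor([lam, 1.0 - lam], device=sims.device).expand_as(sims)
    return -(targets * F.log_softmax(sims, dim=-1)).sum(-1).mean()

if __name__ == "__main__":
    enc = Encoder()
    a, b = torch.randn(8, 1, 128), torch.randn(8, 1, 128)    # two batches of series
    lam = float(torch.distributions.Beta(0.2, 0.2).sample())  # mixing component
    loss = mixup_contrastive_loss(enc, a, b, lam)
    loss.backward()
    print(loss.item())
```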